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INVERSION  ALGORITHMS  FOR  GEOPHYSICAL  PROBLEMS 


I .  INTRODUCTION 

The  purpose  of  this  note  is  to  give  a  brief  introduction  to  the  concept 
of  inversion  and  to  discuss  a  few  different  approaches  that  have  been 
considered.  After  some  analytical  preliminaries,  we  deal  with  two  specific 
algorithms  that  have  been  used  for  the 

(1)  inversion  of  gravity  data  in  order  to  retrieve  the  underlying 
topographic  profile,  and 

(2)  inversion  of  radar  measurements  in  order  to  retrieve  ocean  spectra. 

Further  details  and  developments  on  this  topic  will  constitute  the 
subject  of  a  future  report. 

II.  ANALYTICAL  CONSIDERATIONS 

The  inversion  problem  is  analytically  akin  to  the  solution  of  a 
Fredholm  integral  equation  of  the  first  kind: 

b 

y<9)  -  |  f(OK($,i|)d£  +  e(r|)  .  (1) 

a 

Here  a,b  are  given  constants  and  f(£)  is  the  unknown  function  which,  when 
averaged  over  the  closed  interval  (a,b)  according  to  the  kernel  K(^ ,  77 )  ,  must 
reproduce  the  observed  function  y (tj)  allowing  for  a  random  error  e ( 77 )  .  In 
many  applications  the  kernel  of  the  integral  equation  is  referred  to  as  the 
"transfer  function"  of  the  physical  problem. 

Methods  of  solution  for  eq.  (1)  can  be  helpful  in  elucidating  the 
procedure  in  cases  of  geophysical  interest. 

We  can  express  the  solution  of  eq.  (1)  as  a  linear  combination  of  a  set 
of  functions  according  to  the  scheme: 


m 

f(0  =  2  x#  (o  +  A O;  (2) 

j-i  J  J 


the  choice  of  the  -set  {<^j  (£)  }  and  the  number  m  of  functions  will  be  made 
according  to  the  specific  problem  and  can  be  left,  at  least  for  now,  open. 
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Substituting  eq.  (2)  into  eq.  (1)  and  assuming  that  we  know  the  values 
y (?7i)  “  y i  of  y(rj)  at  n  distinct  points,  we  get  a  (n  x  m)  linear  system: 


where 


2  A. .  x.  +  e. ,  (i 

j-i  J  1 


b 

a 


and  the 


b 

e(r?i>  +  J  fiemt.njde 

a 


(3) 


are  known  as  the  "error"  terms. 

The  suitability  of  the  set  of  functions  and  the  number  m  of  the 
components  of  the  set  are  two  factors  that  vary  according  to  the  physics  of 
the  problem;  also  the  nature  of  the  kernel  function  is  dictated  by  the 
specific  physical  problem. 

Thus,  according  to  this  procedure,  the  solution  of  the  integral 
equation  has  been  reduced  to  a  linear  system.  One  question  which  is 
appropriate  to  clarify  is  the  significance  of  the  parameters  xj .  We  can 
show  that  the  Xj  represent  the  values  of  the  integrals  of  the  unknown 
function  f(£)  according  to  the  basis  functions  (O  :  in  other  words,  we  can 
say  that  the  Xj  are  the  moments  of  f(£)  with  respect  to  the  (f)  provided 
that  <£*(£)  be  chosen  to  be  orthogonal  to  the  elements  of  the  basis,  i.e. 

b 

|  **<o*k(e)de  -  o  .  (4) 

a 


One  possible  choice  is  to  assume  the  basis  to  be  of  the  ’’boxcar"  type,  i.e. 


*j«> 


in  which  case,  we  have 


£(OW. 


=  1  for  £  <£<£., 

=  0  outside  the  interval; 


and  this  is  the  average  of  f  (£)  within  the  interval  . 
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The  number  of  parameters  required  for  an  adequate  representation  will 
depend  upon  the  smoothness  of  the  unknown  function  in  the  given  interval. 
Most  of  the  existing  methods  are  based  upon  constructing  linear  averages  of 
the  unknown  function  whose  values  are  defined  by  empirical  data.  This  is 
essentially  the  approach  first  introduced  by  Backus  and  Gilbert  (1968, 

1970). 

It  has  long  been  recognized  that  inverse  problems  do  not  yield  a  unique 
solution.  This  is  due  to  the  fact  that  there  exist  subsets  of  solutions 
satisfying  the  homogeneous  integral  equation  which  might  not  correspond  to 
geophysical  reality  but  which  do  contribute  to  the  general  solution  of  the 
nonhomo gene ous  data  equations.  Steps  must  be  taken  to  eliminate  them. 

One  such  step  is  to  minimize  some  suitable  functional  of  the  variables 
and  errors.  Thus,  to  eq.  (3)  we  could  add  the  "variational"  condition  that 
an  expression  of  the  following  kind 

n 

q«Z  [f(x)+h(e)]  (5) 

r-1 


be  a  minimum. 

Another  possible  step  is  the  incorporation  within  the  original  system 
of  a  priori  information  by  treating  the  unknowns  as  Gaussian  random 
variables  having  mean  values  x  and  Cx  as  covariance  matrix.  To  elaborate 
this  point,  let  us  rewrite  eq.  (3)  in  vector  notation 

y  -  Ax  +  e  (6) 

where  A  is  a  matrix,  and  let  us  suppose  that  we  can  build  some  averages  of 
our  data  set  which  might  have  a  useful  geophysical  interpretation  and  be 
stable  functions  of  the  data.  Let  us  denote  this  averaging  operation  by 

A 

-+  -»■ 

X  -  Hy, 

so  that  we  can  write 

A 

x  -  HAx  +  He. 

We  can  then  construct  the  "estimation  error"  as 

A 

x  -  x  -  (HA  -  I)  x  +  He  (7) 

where  I  is  the  identity  matrix. 

If  we  assume  a  covariance  matrix  Cx  for  the  parameters  x  and  a 
covariance  matrix ^Ce  for  the  errors  e,  then  the  covariance  matrix  C  for  the 
estimation  error  x  -  x  is  given  by 

c  -  (HA  -  I)CX(HA  -  I)T  +  HCeHT ,  (8) 

where  the  upper  T  denotes  the  transpose  matrix  operation.  We  are  assuming 
here  that  the  a  priori  estimates  xQ  are  statistically  independent  of  the 
data  errors  e. 
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The  matrix  H  which  minimizes  the  diagonal  terms  in  C  can  be  shown  to  be 

H  -  Cx  AT(ACxAT  +  Cg)'1  (9) 

by  using  procedures  akin  to  least  squares  problems.  This  matrix  is  known  as 
the  "minimum  variance  estimator"  and  has  been  used  for  averaging  purposes 
among  others  by  Jackson  (1979). 

III.  INVERSION  OF  GRAVITY  DATA 

We  must  devise  an  algorithm  that  will  allow  us  to  determine  the  density 
profile  of  various  lower  layers  based  upon  certain  gravity  anomaly 
measurements  taken  at  the  upper  surface.  Such  an  inversion  scheme  can  be 
applied  to  ascertain  the  topography  of  seamounts  and  the  structuring  of  the 
uppermost  layers  of  the  ocean  floor. 

We  assume  a  two-dimensional  fixed  geometry  consisting  of  an  array  of 
identical  rectangular  blocks  of  variable  densities.  We  denote  by  x^  the 
abscissa  of  the  i-th  data  point  and  by  (xi,zi)  the  coordinates  of  the 
centroid  of  the  j-th  block  of  dimensions  and  of  an  as  yet  unknown 

density  <Sj  . 

For  a  fixed  depth  Zj ,  we  reach  a  linear  system 

g  -  A?  +  e  (10) 

where  g  is  the  vector  of  the  given  measurements  g^  (i-1 , 2 , . . . , n) ,  8  is  the 
vector  of  the  unknown  densities  8a  (j-1 , 2 , . . , ,m)  and  e  is  the  vector  of  the 
noise  e^  associated  with  each  of  the  data  points.  A  is  a  matrix  whose 
elements  a^j  represent  the  influence  of  the  j-th  block  on  the  i-th  gravity 
value,  and  can  be  represented  in  terms  of  the  dimensions  of  the  block  and 
its  relative  position  with  respect  to  the  location  of  the  measurement.  It 
is  essentially  the  gravity  generated  by  a  vertical  slab.  Its  expression  is 
available  in  Lanzano  (1984). 

It  is  a  very  ascertained  fact,  see,  e.g.  Fisher  and  Howard  (1980),  that 
the  above  linear  system  of  equations  when  considered  in  its  square  form 
(i.e.  when  the  number  of  data  points  equals  the  number  of  blocks)  is  a 
singular  system.  We  must,  therefore,  take  into  account  a  situation  whereby 
the  number  m  of  blocks  is  larger  than  the  number  n  of  data  points  which 
allows  for  an  infinity  of  solutions. 

In  order  to  bracket  our  solution  and  exclude  many  density  distributions 
that  lack  geophysical  significance,  we  shall  make  recourse  to  a  minimum 
principle.  We  impose  to  minimize  a  suitable  functional  of  densities  and 
errors.  Specifically,  we  choose  here  to  minimize  the  following  quadratic 
form  of  densities  and  errors 

-►?  ->2 

f  =  Wc<$  +  W  e  (11) 

8  e  v  7 

with  appropriate  weighting  matrices  of  diagonal  form.  This  functional 
expression  was  first  introduced  by  Last  and  Kubik  (1983) .  The  physical 
significance  of  this  condition  will  become  evident  from  what  follows. 
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The  area  of  the  configuration  is  given  by 


/  in 

Area  =-  dh  Lim  I  2 

e~*Q  \j=i 


because 


Lim 

€-0 


when  5^0 
when  <5=» 0  . 


Consequently,  by  choosing 


w^1  -  Diag  (e  +  sj) 


(12) 


(13) 


we  can  minimize  the  area  of  the  model,  i.e.,  maximize  its  compactness,  by 
adopting  a  very  small  value  for  €.  An  algorithm  leading  to  the  most  compact 
configuration  presents  many  attractive  features  because  it  can  be  used  to 
enhance  the  contrast  between  the  denser  seamounts  and  the  lighter  ocean 
water  for  each  solution  of  our  system. 


Once  the  weighting  matrices  are  fixed,  we  can  solve  our  linear  system 
(for  m>n)  subject  to  the  condition  that  the  functional  be  a  minimum  by  means 
of  a  least  squares  procedure.  Specifically,  we  find  that 


-I  T  .  -I  T  -l\  -1 
S  -  A  (AW5  a  +  J  g 

W^1  -  Diag  (e  +  6^)  . 


(14) 


This  turns  out  to  be  a  nonlinear  problem,  because  the  weighting  matrix 
which  appears  in  the  resolvent  equation  depends  on  the  5 j ' s  which  are 
still  unknown  .  This  can  be  overcome  by  performing  an  iterative  algorithm: 
we  can  choose  the  zero-order  approximation  to  be 

W^0)  —  I  (identity  matrix) 


and  subsequently  we  must  have 
•1 


W 


(k-1) 


£  + 


JJ 


(k-1) 


i  2 


when  k>l  . 


The  k-th  order  iteration  can  then  be  written  as  follows: 


W 


(k-1) 


1  -1 


T  \ 

A  (  A 


W 


(k-1) 


1-1 


T 

A  + 


W 


(k-1) 


-1 


g  •  (15) 
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In  dealing  with  the  noise  terms,  it  has  been  found  appropriate,  see  also 
Jackson  (1979) ,  to  adopt  a  weighting  matrix  of  the  sort 


-12  -IT 

W  -  l  Diag  (A  Wc  A  ) 
e  o  &  8 


(16) 


where  lQ  is  an  apriori  estimate  of  the  noise  to  signal  ratio.  The  signal 
being  that  part  of  the  observed  anomaly  which  is  attributable  to  the 
structure  being  investigated. 

The  above  matrix  is  independent  of  the  errors  themselves.  Subsequent 
iterations  will  give  rise  to 


where 

-1  T 

D  -  AW  1  A  . 
o 

Convergence  is  deemed  to  have  been  achieved  when  further  iterations  do  not 
appreciably  alter  the  density  distribution. 

The  constant  e  should  be  chosen  as  small  as  possible  but  compatible 
with  the  computer  so  as  not  to  cause  any  instability. 

This  algorithm  can  be  upgraded  to  include  an  upperbound  for  the  density 
values;  i.e.,  we  can  impose  the  condition  6j<S*  for  every  j=l , 2 , . . . ,m  and 
for  each  depth.  The  algorithm  must  reset  equal  to  8*  the  density  of  any 
block  that  goes  beyond  this  preassigned  limit  and  will  downplay  that  block 
in  the  following  step  of  the  iterative  procedure.  This  can  be  achieved  by: 
(1)  subtracting  the  gravity  contribution  for  that  particular  block  from  the 
total  gravity  anomaly  (data  set),  and  (2)  assigning  a  very  small  weight  to 
that  block  so  that  its  contribution  will  be  negligible. 

This  is  essentially  a  Penalty  Function  approach  and  can  be  performed  by 
using  a  unit  step  function,  see  Lanzano  and  Myers  (1985).  Thus  a  small 
correction  is  applied  to  those  blocks  that  had  reached  the  limiting  density 
5*;  should  this  correction  be  negative,  bringing  the  total  density  of  those 
blocks  below  the  limit,  the  normal  weighting  functions  should  be  used  then, 
allowing  those  blocks  to  be  used  freely  in  the  minimization  procedure. 

This  algorithm  has  been  tested  at  NRL  on  synthetic  data  using  a  small 
computer  (HP7000) .  Stability  and  convergence  required  at  least  12 
iterations.  The  length  of  the  computations  did  suggest  that  use  of  a  faster 
computer  is  certainly  desirable. 

IV.  INVERSION  OF  RADAR  MEASUREMENTS 

Another  problem  of  interest  concerns  the  determination  of  ocean  spectra 
obtainable  by  means  of  experiments  which  depend  on  the  direction  of  the  wave 
propagation.  If  B  is  the  direction  of  propagation  for  a  wave  of  frequency 
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f,  the  spectrum  S  (0,f)  is  related  to  the  data  vector  d  via  an  integral 
equation  of  the  sort: 

2tt 

d  -J  S(8,f )  k(0)  dff  +  e  .  (17) 

o 


Here  k  is  the  vector  kernel  of  the  physical  process  and  e  is  a  random  error 
vector. 

The  inverse  problem  consists  of  determining  S  by  solving  from  eq.  (17) 
once  the  d  and  e  vectors  are  given.  This  inversion  cannot  be  unique,  since 
the  directional  distribution  is  a  continuous  function,  whereas  the  data  set 
d  is  of  finite  dimension.  The  most  effective  method  of  solution  is  to  add 
to  the  integral  equation  a  functional  of  related  variables  to  be  minimized. 

We  choose  the  functional  to  depend  on  the  errors,  furthermore  it  should 
establish  a  minimum  variance  of  the  spectrum  from  a  particular  preferred 
spectrum  SQ  (i.e.  the  model  to  be  expected)  and  must  also  express  the 
positiveness  of  the  spectrum.  This  outlook  has  been  adopted  and  elaborated 
upon  primarily  by  Long  and  Hasselmann  (1979). 

The  functional  to  be  minimized  will  be  written  as 

2  7T  2x 

r)  -  eT  Qe  +  a  J  (S-|s|)2d0  +  0  J  (S-Sq)2  d 6,  (18) 

o  o 

where  Q  is  a  symmetric,  positive  definite  square  matrix  such  that  its 
inverse  V-Q"1  is  the  covariance  matrix  of  the  errors 

-+  -+T 

V  *  <e  e  >  . 


This  covariance  matrix  V  can  be  determined  using  standard 
techniques  of  time  series  analysis  once  some  assumptions  have  been  made 
concerning  the  distribution  of  errors. 


In  eq.  (18),  a  and  f)  are  two  weighting  parameters  whose  values 
represent  the  relative  importance  of  the  conditions  to  which  they  are 
attached.  They  can  be  considered  as  Lagrangian  multipliers  in  the  sense  of 
the  classical  calculus  of  variations  approach,  which  is  aimed  at  minimizing 
a  functional  to  which  auxiliary  conditions  have  been  appended. 

The  selection  of  the  Q  matrix  establishes  a  probability  region  around 
the  domain  d  such  that 


1 

p(e,d)  - 

J  2n 


exp 


(19) 
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represents  the  probability  of  obtaining  the  values  d  +  e. 


We  must  therefore  study  the  variation  exhibited  by  the  functional 


»7<S,  |S|  ,e) 

caused  by  all  possible  variations  <5S  of  the  spectrum  S(0,f)  within  a  class 
of  allowable  solutions  to  eq.  (17). 


For  this  purpose, 
eq.  (17) ,  one  has 


let  us  recall  that  e  and  S  are  related  because,  from 


2  n 

e  -  d  -  f  S(0,f)k(0)d 9  . 


Since  neither  the  vector  k  (i.e.,  the  physical  process)  nor  the  vector  d 
(i.e.,  the  data  points)  are  supposed  to  undergo  any  variation,  the 
variations  SS  and  8e  can  be  related  according  to 


2tt 

Se  -  -  J  [k(0)<5S]  dd  . 

O 


(20) 


Thus,  the  variation  5rj  can  be  written  in  terms  of  partial  derivatives  and  we 
reach  an  expression  of  the  sort 

(dr\  dV  de\ 

We  get 


d|s|\ 

-  ^)ssi  d9 


2n 

+  P  |  [2(S  -  Sq)  SS]  dS  +  2eT  Q  5e  . 
o 


When  S>0,  the  integrand  in  the  first  term  will  vanish  identically;  on  the 
other  hand  when  S<0,  we  have 


d|s| 

dS 


=  -1  . 


We  can  therefore  write  in  both  cases  that  8r)=* 0  is  equivalent  to 


8 


2tt 

|  { [ 4a ( S -  | S |  )  +  2j3(S-Sq)  -  2eT  Qk]5S)  dd  =  0, 

o 

where  use  has  been  made  of  eq.  (20)  to  eliminate  the  Se 

Due  to  the  arbitrariness  of  6 S,  the  integrand  must  vanish,  leading  to 

4a ( S -  |  S  | )  +  2/3(S-Sq)  =  2eT  Qk  . 


When  S>0,  we  get 


S  -  So  -  f  e  <*  k  ■ 

Whereas ,  when  S<0 ,  we  have 


(8a  +  2/?)S  -  20Sq  +  2eT  Qk  , 


P 


or 


4a  +  f3  o  fi 


-►T  -► 

(S  +  -  e  Q  k) 


(21) 


(22) 


Knowing  k  and  assuming  a  certain  e,  we  can  thus  determine  the  function  S 
which  minimizes  the  r?  functional  provided  that  we  choose  the  two  parameters 
(*,0). 


Let  us  rewrite  the  last  term  within  parenthesis  in  an  equivalent  but 
more  convenient  way.  Since 


-►T 

e 


Qk 


is  a  scalar,  it  must  coincide  with  its  transpose.  Applying  the  property 
that  the  reverse  order  of  the  transposed  factors  is  used  in  performing  the 
transpose  of  a  product  of  matrices,  we  can  write 


-+T  -►T  -►  T  -►T  T-+  -►T  -► 

e  Qk  —  (e  Qk)  -  k1  Qle  -  k1  Qe  ;  (23) 

the  last  step  is  due  to  the  fact  that  Q  is  a  symmetric  matrix. 

We  must  now  impose  that  the  condition  pertaining  to  the  positiveness  of 
S  be  verified  exactly:  this  necessitates  to  let  a  approach  infinity  within 
eq.  (18). 


Eq.  (22)  will  then  vanish;  whereas  eq.  (21)  can  be  rewritten,  through 
the  use  of  eq .  (23),  as 


S 


S 

o 


+  -  (k  Qe) 


(24) 


9. 


This  function  S,  however,  should  also  be  a  solution  of  eq.  (17);  we  must 
therefore  substitute  S  from  eq.  (24)  into  eq.  (17).  In  doing  so,  we  get 


2n 


-I 


(S  k)  d 9 
o 


e  + 


1  2* 
J  \ 


-t-T  - 1  -+  -* 

(k  V  e)  k  d« 


(25) 


The  left-hand  side  of  eq.  (25)  is  a  known  quantity;  both  e  and  j3 ,  however, 
are  still  unknown.-  The  above  equation  turns  out  to  be  a  Fredholm  integral 
equation  of  the  second  kind  with  respect  to  e  with  variable  parameter  /?;  in 
it  the  unknown  function  e  appears  both  inside  and  outside  the  integral  sign. 

Equation  (25)  can  be  solved  by  iterative  procedures,  noting  however 
that  any  initial  choice  for  e  or  subsequent  iterations  should  necessarily 
imply  the  determination  of  the  covariance  matrix  V.  Once  e  is  determined  by 
solving  eq.  (25),  eq.  (24)  will  provide  the  desired  spectrum  S. 

Various  possibilities  exist  for  solving  eq.  (25)  and  they  will  be 
discussed  extensively  in  a  future  report.  Our  present  analysis  has  taken  us 
from  the  original  eq.  (17)  to  this  final  eq.  (25)  through  the  process  of 
assuming:  (1)  only  positive  values  for  S,  and  (2)  a  minimum  variation  of  S 
from  an  expected  function  SQ. 

By  imposing  these  physical  limitations  which  are  of  plausible 
acceptance,  we  have  gained  a  computational  advantage  because  eq.  (25)  is 
easier  to  solve  than  the  original  eq.  (17). 


An  equation  similar  to  our  eq.  (25)  above  was  used  by  Long  and 
Hasselmann  (1979)  to  measure  the  directional  properties  of  swells  in  shallow 
waters  for  the  purpose  of  comparing  various  swell  decay  models  available. 
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